perm filename VAX1[S79,JMC] blob
sn#443114 filedate 1979-05-22 generic text, type T, neo UTF8
----Message 27 is----
Date: 10 May 1979
From: Scott Fahlman @ CMUA
Subject: VAX1.TXT
This file is a continuation of VAX.TXT[c380sf50] at CMUA which
was frozen after 26 messages. Copies of VAX.TXT and VAX1.TXT
exist at MIT-AI as SEF;VAX > and SEF; VAX1 >.
----Message 28 is----
Date: 10 May 1979 1651-EDT
From: MIKE.KAZAR at CMU-10A
Subject: Address space flames
This message is a flame about the way that the huge address space is
managed on Unix and VMS. Specifically, I am complaining about the
existence of these large core images that get loaded just as in
TOPS-10.
I happen to be a fan of dynamic linking, in case you are wondering
what alternatives there are to the above. The idea here is that
when any procedure is CALLED, it is located in the file system and
mapped into the address space. The caller remains in the address space
too. This has the advantage that a call to the
editor from the mailer, for example, looks identical to any other procedure
call in the system.
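(To make the contrast concrete, here is a rough C sketch of the caller's
side; the name "editor" and the stub at the end are invented just so the
fragment stands alone, and this is not code from any of the systems under
discussion.)

    /* Hedged illustration only: with dynamic linking, running the
     * editor is just a procedure call.  The first call faults, the
     * linker finds editor() in the file system, maps it into this
     * address space, and resumes. */
    extern int editor(const char *filename);

    int reply_with_editor(const char *draft)
    {
        return editor(draft);       /* looks like any other call */
    }

    /* stand-in definition so this fragment compiles on its own; in a
     * dynamically linked system it would live in its own object file */
    int editor(const char *filename) { (void)filename; return 0; }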
On Unix, the standard way of doing this is to fork a new process
and tell it what to do. This has two main disadvantages. The first
is that the overhead difference is quite significant. It is a moby
job handling things such as creating a new address space and page tables,
initializing the new process correctly and then going through a protocol
to tell this new process what it is that you want it to do.
I have been told that this takes on the order of a cpu second
on PDP-11 Unix, but have been unable to get an accurate figure
for VAX/Unix.
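(For comparison, a minimal sketch, in today's C idiom, of the fork-and-exec
route described above; error handling is abbreviated and /bin/ed is only an
example of the program being run.)

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* "Call" the editor the standard Unix way: build a whole new
     * process and address space, then tell it what to run. */
    int run_editor(const char *filename)
    {
        pid_t pid = fork();
        if (pid < 0)
            return -1;                        /* fork failed */
        if (pid == 0) {                       /* child: become the editor */
            execl("/bin/ed", "ed", filename, (char *)0);
            _exit(127);                       /* exec failed */
        }
        int status;
        waitpid(pid, &status, 0);             /* parent: wait for it */
        return status;
    }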
The second disadvantage is that this is a real non-uniformity in that
we now have two ways of calling a procedure, depending on whether
or not it is supposed to run in the same process. This is really just
aesthetics, since it is not that hard to decide which type of call is
needed, but if/when that decision changes, lots of things also have to be re-compiled.
That is another advantage of dynamic linking: when a new version
of anything comes out, nothing has to be re-linked.
It has the disadvantage of making process creations take on the order
of a cpu second. That, at least, is about the amount of time that it takes
to link the user and system ring stuff when you log into multics (about
.5 seconds on a KL-10 speed machine). The further space overhead of having
to provide a place for pure procedures to put OWN variables (either
copy-on-write or multics-style linkage segments) is also a problem,
though not severe.
However, the elegant call mechanism is actually useful for places
that do lots of development of large systems.
A dynamic linker can be added to any system that can map files into
its address space. However it also requires special object file
formats so that the linker can get all of the necessary information.
Anyway, here is my proposal that we go through the effort to ensure
that the required information is provided in the object files, so that
at least those of us who wish to use dynamic linking will be able to
use the software that will be developed on whichever system we decide
to go with.
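(As a rough idea of the kind of information meant here, and with a layout
invented purely for illustration rather than taken from any real object
format, each unresolved external reference needs at least something like:)

    /* Hypothetical dynamic-link record; field names and sizes are
     * invented and do not describe any real object file format. */
    struct dynlink_ref {
        char          symbol[32];   /* external name to look up          */
        unsigned long ref_offset;   /* where in the image to patch       */
        unsigned int  ref_kind;     /* e.g. call site vs. data reference */
    };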
I think that some thought on this is advisable before we end up with either
a system with two completely different methods of procedure invocation, or,
like with TOPS-X, a system where the ability to do anything after typing
↑C is limited to those things which leave the core image intact.
----Message 29 is----
Date: 13 MAY 1979 2004-EDT
From: WNJ at MIT-AI (William N. Joy)
To: SEF at MIT-AI
Via: MIT-AI; 13 May 1979 2006-EDT
Some comments on dynamic linking:
1. Creating processes on UNIX doesn't take anywhere near a second of CPU time
for small processes. Remember that NO LINK EDITING is needed, so that
creating a process is simply a matter of loading it into core and starting it.
2. UNIX loses on sharing in that many processes will have their own copies of
common routines such as the formatted print routines. But there is LESS overhead in
starting these routines in that the system does not have to scour them up or
fault the first time they are referenced. This is not to say that the UNIX
scheme is better than dynamic linking - it is not, but it is important
to point out that certain kinds of overhead are low.
3. Created processes have more context than does a procedure
which is running in the same address space as the caller.
That is, a created process has its own full I/O context, more
control over signals, alarm clocks and the like. It is often the case
that a created process needs the freedom to muck up this context.
It would be nice to have a system with a uniform calling mechanism.
There was a project within DEC to implement such a system (Jim Hamilton
gave a talk at Berkeley about it) but it was cancelled. I don't think
I am convinced that such uniformity is feasible in this architecture without
a significant time penalty (even with microcode assist).
In the absence of a uniform calling mechanism, the combination of a
good dynamic linking facility with a fast process creation facility seems
a reasonable alternative.
----Message 30 is----
Date: 15 MAY 1979 1616-EDT
From: LB at MIT-AI (Leonard Bosack)
Subject: -20 ADDRESS SPACE
To: RWK at MIT-AI
CC: SEF at MIT-AI, RMS at MIT-AI
Via: MIT-AI; 15 May 1979 1650-EDT
If you are basing your statement "... the PDP-10/20 architecture
CAN NOT fill the needs of the research community in the future"
on address space size, I think you may have incomplete information.
The "extended" -20 addressing provides a linear, homogeneous
30-bit virtual address space for both Kernel and User programs.
Considering this extension was made almost 10 years after the PDP-6,
it is surprising that it is almost crock-free. I think LISP would
have no problems dealing with it (nor would any other language whose insides
I know something about; even people writing assembler code seem to have
good luck). Details of how it works can be
found in the 10/20 System Reference Manual (DEC Order # EK-10/20-HR-001),
which has been available since Feb. 1978.
There exist many reasons why one might prefer VAX (VMS even) to the
-20, but address space is not among them.
----Message 31 is----
Date: 16 May 1979 1538-PDT (Wednesday)
From: mark at UCLA-Security (Mark Kampe)
Subject: VAX/VMS vs Unix/32
To: fahlman at cmua
Via: UCLA-SECURI; 16 May 1979 1918-EDT
At Interactive, we do a great deal of development work on Unix. We are also
building a Unix-like environment on VMS. Most of my experience has been with
the internals of Unix and Unix-variants, but we also have several people who
are quite into VMS. Hopefully, some of them will start making contributions
to the dialog too.
Most of the considerations which we have encountered in weighing the two
approaches to a system on the VAX have been well brought out and I can, at
this time, think of only a few items to add.
Interprocess communication:
Unix is indeed somewhat weak in this respect. Unix signals are not a
good event mechanism because they communicate too little information and
cannot (rigorously speaking) be properly recovered from. The only
synchronization mechanism is lock files, which are awkward and outdated. These
deficiencies are present in Bell's Unix - but are not intrinsic in the
system.
There is available, from UCLA and from Interactive, a device driver which
implements full semaphores, both private and public. No changes are required
to the system to incorporate this device driver. The implementation is very
efficient and very convenient to use.
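(Roughly, and with the device name and ioctl codes invented here rather than
taken from the actual driver, using it from user code looks like an ordinary
open plus a pair of ioctls:)

    /* Hedged sketch of a semaphore-as-device-driver from user code.
     * Device name, ioctl commands and semantics are all invented for
     * illustration; the real driver's interface may differ. */
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    #define SEM_P  1      /* invented ioctl command: wait   */
    #define SEM_V  2      /* invented ioctl command: signal */

    int main(void)
    {
        int sem = open("/dev/sema0", O_RDWR);   /* hypothetical device */
        if (sem < 0)
            return 1;
        ioctl(sem, SEM_P, 0);    /* enter critical section */
        /* ... touch the shared resource ... */
        ioctl(sem, SEM_V, 0);    /* leave critical section */
        close(sem);
        return 0;
    }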
The CAC group at the University of Illinois developed a fairly simple
implementation of an event mechanism which is also quite general and easy
to use. These events have been used (by the CAC group) to implement fully
asynchronous I/O directly available to the users. (As has been observed
earlier, most I/O is really asynchronous anyway, inside the system.)
File formats and integrity:
I just want to reiterate what a few other people have said. Years ago
Unix got a lot of bad press for having a fragile file system. The reason
Unix's file system seemed fragile is that Unix tries to do as little I/O
as possible and tries to maintain a buffer cache in core. The reward for this
is that Unix can outperform other DEC systems for file-I/O throughput, by
use of very clever scheduling of I/O. The penalty is that if the system is
halted suddenly, there may be a lot of I/O that hadn't happened yet.
This is, however, a non-problem. The people who reported a 60 hour mtbf
are running very unreliable Unix systems. At Interactive (where we have
a clean power unit) our system stays up - literally a month at a time. It
is taken down only for PM and occasional system testing. Moreover, in 98%
of all problems that customers report (usually caused by experimental software,
or a jerk who likes to push buttons on the console) the ONLY damage is
missing free space. This can be detected and corrected automatically.
We do a lot of system testing, and we have a problem that calls for
operator (not guru, but operator) intervention a few times a year. I don't
think that justifies the attribution "fragile".
With regard to formats, we have encountered a major problem in VMS which
I have not heard mentioned. RMS imposes a large number of inefficiencies and
unreasonable restrictions on the users of VMS. One of them is the existence
of two (actually there are more) highly different file formats on the system
(binary and text). Binary files are random access byte streams, while text
files are more structured. One must know what type of file he is about to
read before he can open it - and RMS won't automatically do the "right thing".
People used to a Unix environment, where the output of one program is
often just passed as the input to another, are scr*wed by this, since the
decision to use text-format files or binary-format files precludes
handling files generated by a program which uses the other. To suggest
that all data is either executable core images or lines of ascii text is
quite narrow-minded. It takes away much of the flexibility that Unix users
are used to.
Virtual address space and addressing:
The ability to map pages of a file directly into a user's address space
is a very important one, as anyone knows who has had that ability and then
moved to a system which did not. I suffered that pain when we moved to
Unix (so many years ago). With that ability comes a very efficient
mechanism for sharing subroutines, one which VMS supports. Also, the current
Bell Unix/32 is a swapping system - clearly a mistake on a machine such as
the VAX.
The fact that these features do not presently exist in Unix (a la Bell)
does not mean they are hard to add. We at Interactive have provided a
customer with a simple system call to map segments of a file into his
address space. The addition took less than a day's work. The particular
solution we chose was non-optimal in that a proper solution would have
used the same paging mechanism as the rest of the system (if it
were paged). The point here is that the fact that Bell didn't put an
important feature into Unix does not mean that Unix is not a willing
vehicle for such features.
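(A sketch of the flavor of that call, with the name and arguments invented
here for illustration rather than taken from what we actually shipped:)

    #include <fcntl.h>

    /* Hypothetical interface: "give me this many bytes of that file,
     * starting at this offset, mapped into my address space."  The
     * call really provided may differ in name and arguments. */
    extern char *mapfile(int fd, long offset, long length);

    char *map_whole_segment(const char *path, long length)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return 0;
        /* the bytes appear directly in our address space; no read() */
        return mapfile(fd, 0L, length);
    }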
Paging is an important issue - the VAX should run a paging system.
That is why several groups are starting work on a paging system. My
experience with the Unix interface for Data Secure Unix demonstrated
that Unix can easily be converted into an efficient paging system
with very little change. Ask any unix-jock what the block I/O subsystem
is ... it is a paging system - on which almost all Unix I/O is based.
I firmly believe that there will be a good paging Unix available within
a year or less. There are people who want to work on one, and people
who want to buy one - the invisible hand dictates that it will be built.
Maintainability:
Don't consider Bell in the support picture. Bell is under a consent
decree which does not include developing/selling operating systems. They
have developed Unix as an internal research vehicle and a platform for
applications within the operating companies. They are marketing this internal
project because people are willing to pay for it. They have an interest
in improving it, but no commitment.
Don't consider DEC in the support picture. They are a large organization
which is aiming at a large segment of the computer market. The C.S. research
oriented ARPA community is not at the center of that market. DEC will
continue to improve the system, to develop languages and DBMSs and stuff
like that. They will not provide the wealth of software and specialized
packages which the ARPA community will demand. They cannot provide the
amazing amount of software which exists for Unix. Whichever system or
hybrid is selected, it will be changed so much to meet the needs of the
ARPA community that no vendor will take any responsibility for it, unless
it is a group specially chartered to care for the system.
Our experience has shown that Unix is very easy to understand and
easy to maintain. An experienced systems person can read all of the
Unix kernel in the time he would invest in two VMS device drivers.
Unix's elegance and maintainability are proven. DEC's choice of writing
an operating system in assembler (when they had BLISS) proves a lot too.
Given the number of changes which the ARPA community will want to make to
the system, the issue of maintainability is a crucial one.
sorry for the soap box,
---mark---
-------
----Message 32 is----
Date: 17 MAY 1979 0009-EDT
From: RMS at MIT-AI (Richard M. Stallman)
To: SEF at MIT-AI
Via: MIT-AI; 17 May 1979 0009-EDT
I think Kazar is right about dynamic linking. It is something
that should certainly be pursued if there is any opportunity.
I'm not certain though that it is even possible on the VAX;
the cretinous lack of any way to continue an instruction
after certain sorts of traps may screw it. What is necessary is
to be able to set up a reference which will be dynamically linked
so that 1) it is guaranteed to cause a page fault, 2) from it there
is some way to find a string containing the name of the symbol
to link it to, 3) when that symbol is looked up there is some way to
alter the reference or the instruction containing it to point at the
symbol's address, and 4) the instruction can be continued or restarted
somehow. I don't remember the VAX architecture well enough to know
off hand whether there is any way to do this. My idea would be:
fill the instruction with an indirect address and the indirect word
with an address in a page that "exists" but is always "swapped out".
The name of the symbol appears in the bytes following the indirect word.
Thus it can be found from the instruction (I hope) or from the data
saved by the page fault (I hope). After the symbol value is found it
is put in the indirect word, or else the instruction is clobbered to
contain it. Then the instruction is continued, with perhaps some
alteration to the data stored by the page fault. Or else, it is started
again, which might require that only certain sorts of instructions
be allowed to point at dynamic links. Such a restriction would probably
not be very expensive, since almost all references will come from
subroutine call instructions, which should be no problem, and anything
which is a problem can be got around by using an extra instruction.
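(In C-ish terms, with names invented and all the VAX-specific details that
could sink the idea left out, the layout in mind is something like this.)

    /* One possible layout for an unsnapped link (sketch only).  The
     * indirect word starts out pointing into a page that is never
     * mapped in, so the first use faults; the symbol name sits right
     * after it so the fault handler can find it. */
    struct dynlink {
        void *indirect_word;   /* patched to the symbol's address later */
        char  name[];          /* NUL-terminated symbol name follows    */
    };

    /* In the fault handler, in outline (these helpers are imaginary):
     *     struct dynlink *l = link_from_fault_info(...);
     *     l->indirect_word  = lookup_symbol(l->name);
     *     continue_or_restart_faulting_instruction(...);
     */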
Maybe someone who knows more details can figure out whether this will
actually win. If it does, the project is certainly feasible, since
dynamic linking has been implemented successfully on a machine which
was not designed for it, at the Architecture machine group (where Kazar
used to be). It wasn't all that hard, and I don't think adding it to
an existing system is really hard either.
----Message 33 is----
Date: 17 MAY 1979 1908-PDT
From: JMETZGER at USC-ISIB
Subject: Re: Vax Operating Systems
To: Scott.Fahlman at CMU-10A
cc: JMETZGER
Via: USC-ISIB; 17 May 1979 2209-EDT
In response to your message sent 16 May 1979 2313-EDT
I would take slight exception to Mr. Bosack's message about the "crock-free"
extended addressing on the 10s and 20s. The architecture does allow
for 30 bits of addressing, but there are only 23 bits implemented in the
current KL10s. An instruction can still only address 256k (18 bits)
directly; to get to the rest of the virtual address space beyond the
locally accessible 256k, indexing must be used. There is no way to
cross 256k "sections" except to jump there (that is, the PC wraps around
within the locally accessible 256k virtual space). If an instruction is
executed at virtual x,,777777 (octal) then the next instruction will be
x,,0 (where x is between 0 and 37 octal), rather than the more expected
x+1,,0. However a program that can't be
easily divided up into 256k sections of code might be accused of not being
very modular. Essentially section zero is wasted since once the pc or an
indirect address calculation enters section 0 it can never get out (except
via a jump instruction). The only practical thing to do when trying to
use the extended addressing features is to ignore section 0 or map it
the same as section 1 (which is what release 3a of tops20 does; this is
only in the monitor's address space, since extended addressing is not
available to users in tops20, yet).
However, the extensions dec made to the architecture are not all that bad;
given the constraints, they're pretty good. The simple thing of making
section 0 not special at all simply won't work (the proof is left to
the reader). One thing that could have been done to make life a little
easier for the user is to have a new PC bit (or a bit someplace in the
user's state, such as the upt) which indicates whether the process is
an old-style non-extended 256k-virtual-address-space pdp10 or a new
extended-virtual-address-space pdp10 (without some of the special meaning
of section 0; specifically, section 0 could address a non-zero section
just as any non-zero section can address a global section).
However, I thought this debate was about which system was best for the future,
not which piece of hardware was best. Unix has already been ported to at
least 4 and maybe 5 different architectures (pdp11, interdata 8/32,
honeywell level 6, vax 11/780, even rumors about an IBM series 1).
I believe any system that is chosen should be independent of any particular
manufacturer's hardware. Hardware is getting dirt cheap, and it is the software
capabilities that are the most important to consider. As long as the hardware
has enough virtual address space, reasonable speed and reasonable
peripheral devices, who cares who makes it? The only thing users will see is
the software interface to the hardware. The question is how well (easily?)
the system implementation language can be mapped onto the instruction set.
As to which software system is "better", it's very hard to say; both vms and
unix have a lot of good and a lot of bad (and outright dumb) features.
This debate reminds me a lot of the debate over which operating system
was best for the pdp10s in the very early 70s. At that time there were a
number of deficiencies in tops10 (and those same deficiencies are still
there for the most part although the system has evolved a lot). Tenex was
just being written and seemed to have a lot of good features, but lacked
maturity (a reliable file system, lots of supported software, etc.; the same
things I've heard about unix vs vms). The mit its system (was it called mac
in those days?) was considered briefly in that debate too. Just for a bit
of history, the debate went like this: we have three options, 1) use dec's tops10
and extend it for our needs (or convince dec to do it), 2) adopt Tenex or the
mit-mac system, or 3) write our own from scratch. It went on through the need
for reliability (this was to be a service system, not an experimental monitor),
for dec compatibility, flexibility and ease of modification so each site could
add their very own weird devices and eccentric style of operation. There had to
be processes with interrupts and interprocess communication. The file
system had to be memory-mapped and page-oriented as well as supporting
byte-at-a-time i/o. Device-independent i/o was desirable, and protection
mechanisms that allowed easy page sharing, as well as protection of shared
pages, had to exist.
CPU scheduling was a big issue. Tenex eventually won out, whence came
tops20. The more things change, the more they stay the same; there was and still
is a great deal of passion in the debate over the "best" o.s. for the 10.
(bottoms-10, twenex, ???). Suffice it to say each system (tops10,
tops20, tenex, its ...) has capabilities and advantages (as well as
disadvantages) the others don't. Each is useful. Each has very useful tools.
So it is with the vax operating systems.
It is my belief that the days of large timesharing systems (like tops20)
are drawing to a close. What is needed instead is a small, personal machine
to develop software (and hardware). You don't have to (or want to) develop
software on the same machine that it will end up running on. You need good
text editors, compilers, debuggers, simulators and other software tools
(like the programmers workbench concept of unix/bell labs) as well as a
good automated hardware design lab (like suds, only more). If the piece of
hardware is not around to run our special ai application, then just knock it
out on the cad system, burn a few proms, slap a few 2901 chips together,
and presto: just what you needed. Why ride a bus when everyone can have
their own little computer? We are going to see networks of special
purpose (program generation) computers tied to large special purpose
computers (array processors, floating point processors, lisp processors etc.).
No matter which way you go, trying to make vms or unix be all things to all
people will be a difficult if not impossible job (talk to the bbn tenex people
about that one). I am really surprised there is so much concentration
on such classical timesharing systems (as vms or unix) at cmu, home of the
hydra, c.mmp, c* .... How about if we build a VM780 (similar in concept to
VM/370) for vax that can run VMS, Unix, RSX, RT, IAS, Elf, Epos (any other 11 o.s.es
around?) as well as emulate 370s, 1108s and pdp10s (which means we can run
dos, os, ... tops10, tops20, its, ... all at the same "time") and connect it
to a chaos/ether/packet radio/arpa/tcp/gateway/internet surrounded by a
satellite system of 11/70s running unix or rsx (for real) which have
super intelligent word processing, screen editing terminals (say 66 lines by 160
characters across minimum, full-color bit-map displays) with interpreters
built into the terminal for lisp/sail/pascal/c/pl1/snobol/cobol/fortran/apl/
bliss/bcpl/algol (60 & 68 of course!)/chili (what chili?? grand babel prize
to anyone that knows what that one is). Note that basic is missing from the
list; I figure with all that language power, who really needs basic anyway?
This would all run on an 8086/z8000/motorola 68000 chip, microcoded to handle
multiple virtual extended addressing (with dynamic linking and access
protection) with non-rotating hierarchical bubble disk replacement file system
with bi-synchronous i/o! You name it, you got it!! But I don't even want to
hear about it!!! Or, in the words of the immortal John Walker,
"if you can't make it work, get a bigger hammer." Let's write the next os in APL!
Seriously, I hope the mistakes that were made with the Tenex decision aren't
repeated this time. I don't believe the decision to develop Tenex was a
mistake at all; rather, not giving the guys at BBN enough time to design and
develop a truly clean and elegant system was the real problem. The
pressure to get something out to satisfy the users did more harm to tenex than
any other single cause. The only reason that Unix seems so clean is that it
was developed by two guys for their own personal use in their spare time.
There was no pressure to get it done by next quarter or to have so much progress
by next week. They only put into unix what they felt was needed or useful,
always trying not to make it bigger: no hacking programs to death, the bare-bones
minimum of utility. They simply didn't have the time to clutter it up
or the pressure to add half-baked features.
Those conditions never exist in a commercial environment and
rarely in a government-funded project (after all, you still have to produce
something to give arpa, even if it is junk). It does seem that no matter
which way arpa goes (unix, vms or tops) there will be a great deal of
changing done. I would suggest unix as a base only because its style and
philosophy are ones that I agree with wholeheartedly. Small is beautiful,
more is not better, and big is ugly. The central theme of avoiding programming,
and then, if you must, doing only as much as is needed, seems very desirable.
The listing of the unix kernel I have is less than 3 inches of paper (which
includes the arpa net code, which is about half of it due to a lot of white
space). The thought of vms taking five (5) 2400-foot reels of tape just boggles
my mind; who can ever be familiar with that much code, let alone understand it?
What I would suggest is that arpa or someone fund a study and design of a
future o.s. for the users' long-term needs. This design effort should be
funded for at least 3 years without there ever being one line of code written.
I would hope that it would start with an examination of all the major t.s.
systems around today (tops20, tops10, its, multics, unix, vms, rsx, tso (yes,
tso, we all need bad examples too!), Data General's aos (which I'm told is
a lot like unix)). A study of system implementation languages is needed
(neither c, pascal nor bliss is quite right). The language should be designed
without regard for which hardware it will eventually run on; it should be
designed to support programmers. This also implies a study of currently
available machine architectures in terms of how well they could support this
language and how well they can support the o.s. (interrupts, virtual memory,
context switching, etc.).
At the end of that time it should only take about 6 months to a year to
get the first cut at the system up and another year or two to polish it up.
That sounds like a hefty investment, but I believe that for the return it would
give, it is well worth it. (If such a thing does happen, tell me where to send
my resume.)
John Metzger
-------